Skip to content

Wire JSON ingestion schema extension modules#1215

Open
jwils wants to merge 1 commit into
joshuaw/json-ingestion-extension-modulesfrom
joshuaw/json-ingestion-api-polish
Open

Wire JSON ingestion schema extension modules#1215
jwils wants to merge 1 commit into
joshuaw/json-ingestion-extension-modulesfrom
joshuaw/json-ingestion-api-polish

Conversation

@jwils
Copy link
Copy Markdown
Collaborator

@jwils jwils commented May 27, 2026

Why

Introduce the JSON ingestion schema-definition extension modules after the indexing extension points exist, but before making JSON ingestion the default implementation.

What

  • Add ElasticGraph::JSONIngestion::SchemaDefinition::APIExtension and supporting factory/results/artifact/state/schema-element extension modules
  • Wire JSON ingestion through the factory so core indexing field references and field types are extended in place
  • Keep the existing core JSON Schema behavior active in this intermediate layer
  • Add doctest support for JSON ingestion schema-definition examples without making the extension the default yet

Risk Assessment

Medium - this adds new extension code, but the default behavior remains the existing core JSON Schema implementation in this PR.

References

  • Stacked on Add JSON ingestion indexing extensions #1204.
  • bundle exec rspec elasticgraph-schema_definition/spec/unit/elastic_graph/schema_definition/json_schema_spec.rb elasticgraph-schema_definition/spec/unit/elastic_graph/schema_definition/json_schema_field_metadata_spec.rb elasticgraph-schema_definition/spec/unit/elastic_graph/schema_definition/indexing/json_schema_with_metadata_spec.rb elasticgraph-schema_definition/spec/unit/elastic_graph/schema_definition/factory_spec.rb passed.
  • script/type_check passed.
  • script/lint passed.

Stack

Current PR is marked with ->.

@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from 3436470 to ea857eb Compare May 27, 2026 16:22
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from b80bef4 to bce83e3 Compare May 27, 2026 16:22
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from ea857eb to 7af1788 Compare May 27, 2026 18:43
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from bce83e3 to 38d7488 Compare May 27, 2026 18:43
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from 38d7488 to 65c8468 Compare May 28, 2026 18:35
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from 7af1788 to 20ee5dd Compare May 28, 2026 18:35
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from 65c8468 to d780944 Compare May 28, 2026 18:44
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch 2 times, most recently from 5b67eda to 28d3b58 Compare May 28, 2026 19:01
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch 2 times, most recently from 08fa741 to 66a50ff Compare May 28, 2026 19:13
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch 2 times, most recently from 4f85649 to fd4651f Compare May 30, 2026 14:07
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch 2 times, most recently from e904b45 to 7cd0f7d Compare May 30, 2026 14:26
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch 3 times, most recently from 2e3770f to 9ed0d47 Compare May 30, 2026 14:38
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from 3d879aa to 01d66b2 Compare May 30, 2026 20:03
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from 9ed0d47 to a02256c Compare May 30, 2026 20:03
@jwils jwils force-pushed the joshuaw/json-ingestion-extension-modules branch from 01d66b2 to 4d549ec Compare May 30, 2026 20:17
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from a02256c to 180c7d3 Compare May 30, 2026 20:17
@jwils jwils marked this pull request as ready for review May 31, 2026 03:58
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from 180c7d3 to cd6256d Compare May 31, 2026 04:17
Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't finished reviewing but wanted to submit my feedback so far.

Comment thread config/site/support/doctest_helper.rb Outdated
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch 2 times, most recently from 55175c5 to 16255f2 Compare May 31, 2026 13:28
Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still not done reviewing but here's my next round of feedback.

Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Next set of feedback (still not done reviewing!).

@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch 3 times, most recently from a1e2bb3 to e28ff9c Compare June 1, 2026 18:39
@jwils jwils force-pushed the joshuaw/json-ingestion-api-polish branch from e28ff9c to 83f6c2a Compare June 1, 2026 18:57
Copy link
Copy Markdown
Collaborator

@myronmarston myronmarston left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made it through the end of my review this time! Almost ready.

Comment on lines +76 to +80
schema_results = json_ingestion_schema_definition_results

@json_schemas_artifact ||= new_yaml_artifact(
JSON_SCHEMAS_FILE,
JSONSchemaPruner.prune(schema_results.current_public_json_schema),
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
schema_results = json_ingestion_schema_definition_results
@json_schemas_artifact ||= new_yaml_artifact(
JSON_SCHEMAS_FILE,
JSONSchemaPruner.prune(schema_results.current_public_json_schema),
@json_schemas_artifact ||= new_yaml_artifact(
JSON_SCHEMAS_FILE,
JSONSchemaPruner.prune(json_ingestion_schema_definition_results.current_public_json_schema),

Since it's memoized it's nice to limit things before the ||=.

Comment on lines +66 to +72
base_artifacts = super

versioned_artifacts = build_desired_versioned_json_schemas(json_schemas_artifact.desired_contents).values.map do |versioned_schema|
new_versioned_json_schema_artifact(versioned_schema)
end

base_artifacts + [json_schemas_artifact] + versioned_artifacts
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
base_artifacts = super
versioned_artifacts = build_desired_versioned_json_schemas(json_schemas_artifact.desired_contents).values.map do |versioned_schema|
new_versioned_json_schema_artifact(versioned_schema)
end
base_artifacts + [json_schemas_artifact] + versioned_artifacts
versioned_artifacts = build_desired_versioned_json_schemas(json_schemas_artifact.desired_contents).values.map do |versioned_schema|
new_versioned_json_schema_artifact(versioned_schema)
end
super + [json_schemas_artifact] + versioned_artifacts

EOS
end

def format_deprecated_elements(deprecated_elements)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to be careful here--there's the potential for a naming conflict. If two EG extension libraries both defined an extension module for SchemaArtifactManager (say, elasticgraph-json_ingestion and elasticgraph-proto-ingestion), and they both defined a method named format_deprecated_elements, they could run into a collision. Each extension would work correctly in isolation but would potentially be buggy when both extensions were used.

That's unlikely to happen for any method that has json_schema in the name...but format_deprecated_elements, missing_field_error_for, missing_type_error_for, missing_necessary_field_error_for, and files_noun_phrase have pretty generic names--I can imagine any extension defining methods with those names and colliding.

This is similar to my comment from last review. Ideally the methods defined on an extension module all fall into one of two buckets:

  1. They offer a new public API
  2. They hook into/override existing methods to customize behavior

I'm not as worried about collisions in these cases. Public APIs are thoughtfully named and unlikely to collide; hooking into existing methods is performed in a composable way with super.

There are two general approaches I'd recommend here:

  1. Instead of an extension module, wrap with a DelegateClass class, like we did with elasticgraph-json_ingestion/lib/elastic_graph/json_ingestion/schema_definition/indexing/field_type/enum.rb in Add JSON ingestion indexing extensions #1204. However, it won't work here because wrapping doesn't let you hook in and override artifacts_from_schema_def (there's no way to intercept that via a wrapping class because it's an internal private method).
  2. Keep the extension module for the methods which fit the criteria above. For everything else, move it into a separate class or module which you delegate to.

I think the 2nd option will work best here but the 1st option may come in handy in other cases.

(Tangentially, I have a similar concern about your results_extension.rb module as well--as it defines so many private methods----but there all the methods have json_schema in the name so a naming collision is less likely. If you get a hankering to give it the same treatment, feel free, but also fine to leave it!).

Comment on lines +5 to +6
module ScalarTypeExtension: ::ElasticGraph::SchemaDefinition::SchemaElements::ScalarType
include HasJSONSchema
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
module ScalarTypeExtension: ::ElasticGraph::SchemaDefinition::SchemaElements::ScalarType
include HasJSONSchema
module ScalarTypeExtension: ::ElasticGraph::SchemaDefinition::SchemaElements::ScalarType
include HasJSONSchema

add_filter "/elasticgraph-local/" unless spec_files_to_run.any? { |f| f.include?("/elasticgraph-local/") }

# The JSON ingestion gem is being introduced by extracting implementation first and moving its tests later.
add_filter "/elasticgraph-json_ingestion/"
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you removed this in a follow up PR? (If so can you link it)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants